Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection
Authors
Abstract
The use of complementary information, namely depth or thermal information, has shown its benefits to salient object detection (SOD) during recent years. However, the RGB-D and RGB-T SOD problems are currently only solved independently, and most methods directly extract and fuse raw features from backbones. Such methods can be easily restricted by low-quality modality data and redundant cross-modal features. In this work, a unified end-to-end framework is designed to analyze both tasks simultaneously. Specifically, to effectively tackle multi-modal features, we propose a novel multi-stage and multi-scale fusion network (MMNet), which consists of a cross-modal fusion module (CMFM) and a bi-directional multi-scale decoder (BMD). Similar to the visual color stage doctrine in the human visual system (HVS), the proposed CMFM aims to explore important feature representations in the response stage and integrate them into fused features in the combination stage. Moreover, the BMD learns the combination of multi-level fused features to capture both local and global information of salient objects, which further boosts performance. The proposed cross-modality analysis based on two-stage fusion can thus be used for diverse multi-modal SOD tasks. Comprehensive experiments ($\sim 92\text{K}$ image-pairs) demonstrate that the method consistently outperforms 21 other state-of-the-art methods on nine benchmark datasets. This validates that our work handles diverse multi-modal SOD tasks with good generalization and robustness, and provides a useful multi-modal SOD benchmark.
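To make the two-stage fusion idea described above concrete, the following is a minimal PyTorch sketch of a cross-modal fusion block with a response stage (per-modality gating) followed by a combination stage. It is not the authors' CMFM implementation; the class name, channel sizes, and the channel-attention design are illustrative assumptions.

```python
# Minimal sketch of a two-stage cross-modal fusion block, loosely following the
# response-stage / combination-stage idea described in the abstract.
# Names and the attention design are assumptions, not the paper's CMFM.
import torch
import torch.nn as nn


class TwoStageCrossModalFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Response stage: per-modality channel gating to highlight informative features.
        self.rgb_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.aux_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Combination stage: merge the re-weighted modalities into one fused map.
        self.combine = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb_feat: torch.Tensor, aux_feat: torch.Tensor) -> torch.Tensor:
        # Stage 1: emphasize salient responses within each modality.
        rgb = rgb_feat * self.rgb_gate(rgb_feat)
        aux = aux_feat * self.aux_gate(aux_feat)
        # Stage 2: fuse the gated features; depth or thermal plugs into the same path.
        return self.combine(torch.cat([rgb, aux], dim=1))


if __name__ == "__main__":
    fuse = TwoStageCrossModalFusion(channels=64)
    rgb = torch.randn(2, 64, 44, 44)   # backbone features from the RGB stream
    aux = torch.randn(2, 64, 44, 44)   # backbone features from the depth/thermal stream
    print(fuse(rgb, aux).shape)        # -> torch.Size([2, 64, 44, 44])
```

Because the auxiliary branch is modality-agnostic, the same block can be fed either depth or thermal features, which is the property that lets a single network cover both RGB-D and RGB-T SOD.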
Similar resources
RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
In this work, we propose to utilize Convolutional Neural Networks (CNNs) to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of suff...
Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling
Most of the existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There has been some attempt on directly learning features from raw RGB-D data, but the performance is not satisfactory. In this paper, we adapt the unsupervised feature learning technique for RGB-D labeling as a multi-modality learn...
Local Background Enclosure for RGB-D Salient Object Detection - Supplementary Results
The purpose of this supplementary material is to examine in detail the contributions of our proposed Local Background Enclosure (LBE) feature. A comparison of LBE with the contrast based depth features used in state-of-the-art salient object detection systems is presented. The LBE feature is compared with the raw depth features ACSD [1], DC [3] and a signed version of DC denoted SDC on the RGBD...
Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition
In this paper, we propose a correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. Unlike most conventional RGB-D object recognition methods which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with a pair of deep neural networks, so that the sharable and modal-specific ...
Learning Graph Matching for Object Detection from RGB-D Images
We propose an optimization method for estimating parameters in graph-theoretical formulations of the matching problem for object detection. Unlike several methods which optimize parameters for graph matching in a way to promote correct correspondences and to restrict wrong ones, our approach aims at improving performance in the more general task of object detection. In our formulation, similari...
Journal
Journal title: IEEE Transactions on Circuits and Systems for Video Technology
Year: 2022
ISSN: 1051-8215, 1558-2205
DOI: https://doi.org/10.1109/tcsvt.2021.3082939